Skip to content

Modular handshake #494

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 34 commits into from
Closed

Conversation

arik-so
Copy link
Contributor

@arik-so arik-so commented Feb 12, 2020

Refactor peer channel encryptor into a whole separate, isolated module and abstract away most of the internals of handshake handling from the peer handler

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly just skimmed it but left some notes for things I came across. The high level API looks good (modulo some "we should move, like, everything out of ln::" thinking, but I am worried about overuse of Vec in a bunch of places (malloc is more expensive in surprising ways than you think, really), and needs docs.

@@ -0,0 +1,173 @@
use ln::peers::{chacha, hkdf};

/// Returned after a successful handshake to encrypt and decrypt communication with peer nodes
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add more details about how you'd use this, what the requirements are for use, etc? Same goes for pretty much everything pub in this PR.

@@ -0,0 +1,34 @@
use util::chacha20poly1305rfc::ChaCha20Poly1305RFC;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this in ln::peers? It seems to be pure crypto functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the AEAD-based encryption methods are only used for handshakes and peer message encryption IIRC, and not for the onion construction.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but its also a pure-crypto primitive. I guess my preference is for such things (even if it implements a lightning protocol crypto primitive) to be in some kind of crypto module.

let length = buffer.len() as u16;
let length_bytes = length.to_be_bytes();

let encrypted_length = chacha::encrypt(&self.sending_key, self.sending_nonce as u64, &[0; 0], &length_bytes);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof. Lets avoid allocating a Vec for two bytes somehow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this is the right place to shave off memory usage at the expense of legibility? I'm a bit hesitant to do the whole decrypt in place approach.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think it changes legibility, and its not a memory usage question, its a question of heap fragmentation and significant performance penalty to hit the heap. I'm dubious you can't make it readable without using a Vec.

}

pub(super) fn read(&mut self, data: &[u8]) {
let mut read_buffer = if let Some(buffer) = self.read_buffer.take() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooooh, TIL. There are a bunch of places where I will replace that.

}

impl Act {
pub fn serialize(&self) -> Vec<u8> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we name this something else? I almost responded with "why the hell would you ever want to serialize an Act for storage?" before I realized it meant "write act to a vec to send to a peer" (also, can we not return a slice here?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps to_vec? We could return a slice, but would have to mark its lifetime as linked to the act object's.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with using lifetimes? The callsite can chose to store it in a Vec or copy to some local buffer if they wont want it.

@@ -0,0 +1,398 @@
use bitcoin_hashes::{Hash, HashEngine};
use bitcoin_hashes::sha256::Hash as Sha256;
use rand::{Rng, thread_rng};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't support rand because it a) often breaks MSRV, and b) doesn't support lots of platforms (read:embedded) that we do, where RNG access is via some proprietary API. Any random values you want need to be passed in from the user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll try to refactor it to have the user pass in entropy.

@arik-so arik-so requested a review from jkczyz February 12, 2020 22:28
@arik-so
Copy link
Contributor Author

arik-so commented Feb 13, 2020

Could you leave moving macros and reducing indentations to a separate PR? For the latter, you can introduce scopes if needed. This will make reviewing the PR much easier as the diff will be smaller. It will also help me avoid some nasty merge conflicts when rebasing. :)

I'm afraid not, because some versions of Rust are confused about variable scopes with macro invocation.

@arik-so
Copy link
Contributor Author

arik-so commented Feb 13, 2020

But for what it's worth, the only file that I am really modifying that existed prior is peer_handler.rs. And this diff is gonna impact it pretty severely one way or another.

@jkczyz
Copy link
Contributor

jkczyz commented Feb 13, 2020

Could you leave moving macros and reducing indentations to a separate PR? For the latter, you can introduce scopes if needed. This will make reviewing the PR much easier as the diff will be smaller. It will also help me avoid some nasty merge conflicts when rebasing. :)

I'm afraid not, because some versions of Rust are confused about variable scopes with macro invocation.

Actually, I may have been a bit confused here. The primary cause of all unrelated diffs is the indentation change. Could you introduce scopes to prevent this? FYI, this comment was suppose to be on do_read_event.

@@ -501,266 +531,214 @@ impl<Descriptor: SocketDescriptor, CM: Deref> PeerManager<Descriptor, CM> where
}
}

macro_rules! insert_node_id {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove this? We would no longer insert into the map without it, AFAICT. This breaks other code that needs the map to look up peers by id.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the unit tests are failing, but I will re-add it to the graph, and then think of unit tests for this in a separate diff

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this code isn't well tested but most (all?) of the public methods depend on that map.

@arik-so
Copy link
Contributor Author

arik-so commented Feb 13, 2020

Could you leave moving macros and reducing indentations to a separate PR? For the latter, you can introduce scopes if needed. This will make reviewing the PR much easier as the diff will be smaller. It will also help me avoid some nasty merge conflicts when rebasing. :)

I'm afraid not, because some versions of Rust are confused about variable scopes with macro invocation.

Actually, I may have been a bit confused here. The primary cause of all unrelated diffs is the indentation change. Could you introduce scopes to prevent this? FYI, this comment was suppose to be on do_read_event.

Honestly, I think the indentation change might be because I moved the macro definitions. But I stand by the moves :)

let mut features = InitFeatures::supported();
if self.message_handler.route_handler.should_request_full_sync(&peer.their_node_id.unwrap()) {
features.set_initial_routing_sync();
match message {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you insert three scopes around this, the diffs become saner.

Copy link
Contributor

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only looked at peer_handler.rs so far, though I do have a good sense of the other code from looking at it earlier. I like the direction this is moving in!

Could you squash some of these commits? See guidelines about atomic commits. Some squashes could wait until the end of the review for sure. But at very least separate commits for making it compile with Rust 1.22.0 are not relevant to the code review. Likewise for the separate commits used to remove the extraneous diffs.

Also, please include the why not just the what in your commit messages (when applicable) and PR description for benefit of reviewers and posterity. See the best practices for writing commit messages that are linked in the guidelines.

@@ -596,33 +591,32 @@ impl<Descriptor: SocketDescriptor, CM: Deref> PeerManager<Descriptor, CM> where
log_trace!(self, "Received message of type {} from {}", message.type_id(), log_pubkey!(peer.their_node_id.unwrap()));

// Need an Init as first message
if let wire::Message::Init(_) = message {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to be nit-picky about this, but we need to aim for atomic commits for the reasons mentioned in our contributing guidelines. All the changes from here to the end of the method appear unrelated to your previous changes and aren't necessary.

I'm all improving formatting, but I'd prefer if this were done at one time when we don't think it will be disruptive. Doing so then will also make the blame history easier to reason about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry, I can't see these changes anymore due to squashing. Was it an indentation change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got most but still a few unrelated changes at lines 591 to 593 and line 755.

@@ -501,266 +531,214 @@ impl<Descriptor: SocketDescriptor, CM: Deref> PeerManager<Descriptor, CM> where
}
}

macro_rules! insert_node_id {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this code isn't well tested but most (all?) of the public methods depend on that map.

let mut handshake = PeerHandshake::new(&self.our_node_secret, &self.get_ephemeral_key());
handshake.make_inbound();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than calling a separate method after new, how about replacing new with make_inbound and make_outbound constructors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or rather new_inbound and new_outbound, sure


if let &mut PeerState::Handshake(ref mut handshake) = &mut peer.encryptor {
let (response, conduit, remote_pubkey) = handshake.process_act(&peer.pending_read_buffer, None).unwrap();
peer.pending_read_buffer.drain(..); // we read it all
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment looks like a code smell. The drain introduces a dependency between PeerManager and PeerHandshake. That is, one module needs to know how the other is implemented. Could you remove this coupling by passing the data to process_act using move semantics rather than by reference?

That said, I need to consider how decrypt is used. We can discuss further offline if you'd like.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've got an idea for this. Add a method to handshake to read only one message, but store the buffer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, that's already happening

}

if let Some(conduit) = conduit_option {
// Rust 1.22 does not allow assignment in a borrowed context, even if mutable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow what you were trying to do as expressed by this comment. What statement wasn't compiling for you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This statement could have been done inside the big if clause preceding these single line assignments.

macro_rules! encode_and_send_msg {
($msg: expr) => {
{
log_trace!(self, "Encoding and sending message of type {} to {}", $msg.type_id(), log_pubkey!(peer.their_node_id.unwrap()));
peer.pending_outbound_buffer.push_back(peer.channel_encryptor.encrypt_message(&encode_msg!(&$msg)[..]));
// we are in a context where conduit is known
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use an assert instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it doesn't make sense because the other variables are assumed to be known, too. Once this is refactored from a macro into a regular method, it will solve itself.

Comment on lines 486 to 489
if let Some(key) = remote_pubkey_option {
peer.their_node_id = Some(key);
}

if let Some(conduit) = conduit_option {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't all this be done within the PeerState::Handshake block above? Then you wouldn't need conduit_option and remote_pubkey_option there with the de-structured moves in the if let statements above. You would just place this code there instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in rust 1.22

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah for conduit at least. At very least you shouldn't have destruct the Option when assigning to conduit_option. You can simply unconditionally assign:

conduit_option = conduit;

Comment on lines -822 to +813
peer.pending_outbound_buffer.push_back(peer.channel_encryptor.encrypt_message(&encode_msg!(msg)));
if let PeerState::Connected(ref mut conduit) = peer.encryptor {
peer.pending_outbound_buffer.push_back(conduit.encrypt(&encode_msg!(msg)));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have a peer in node_id_to_descriptor, it should also have a conduit. Any chance this can be done within get_peer_for_forwarding so the repetition across the method could be removed? We probably want to assert this as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely, though here too, I think that it will be much clearer once get_peer_for_forwarding is refactored from being a macro, which I strongly believe it doesn't need to be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't whether it is a macro or method orthogonal to whether the duplication should be removed?

FWIW, my refactor touches this method so avoiding all the merge conflict where this change is duplicate (~16 places) would be ideal. :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree the redundancy of all these calls suck. Happy for y'all to hash out which PR it gets fixed in, though.

Comment on lines +6 to +17
/// Returned after a successful handshake to encrypt and decrypt communication with peer nodes.
/// It should not normally be manually instantiated.
/// Automatically handles key rotation.
/// For decryption, it is recommended to call `decrypt_message_stream` for automatic buffering.
pub struct Conduit {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word "conduit" is another name for "channel", which I assume you are trying not to overload. That said, nothing about this struct's behaviors fit the definition of those words. That is, the struct is not responsible for transferring (flowing) data from a source to a destination; the caller does that. Rather, it is simply responsible for encrypting and decrypting data that is transferred.

Further, the fields of the struct are essentially divided into analogous "sending" and "receiving" fields used for encrypting and decrypting respectively. And these fields are disjoint. Therefore, I'd recommend dividing this struct into two structs called Encryptor and Decryptor. The common associated functions can be moved to the module-level. Then, each struct has a single responsibility and there is no need to distinguish between types of keys and nonces (and mistakenly using the wrong one).

Then your PeerState enum can become:

enum PeerState {
    Authenticating(PeerHandshake),
    Connected(Encryptor, Decryptor),
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes perfect sense and I would love to do that, though what would you call the module?

Copy link
Contributor

@jkczyz jkczyz Feb 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about simply "encryption.rs"? Or you could break them into "encryptor.rs" and "decryptor.rs" with key rotation utilities living elsewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. But I'd reserve it for a separate PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose to use name Transcoder

Comment on lines 11 to 15
pub(crate) sending_key: [u8; 32],
pub(crate) receiving_key: [u8; 32],

pub(crate) sending_chaining_key: [u8; 32],
pub(crate) receiving_chaining_key: [u8; 32],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you make constants for these? Or reuse some from secp256k1::constants?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These aren't constants as much as they are types, but definitely.

let length = buffer.len() as u16;
let length_bytes = byte_utils::be16_to_array(length);

let mut ciphertext = vec![0u8; 18 + length as usize + 16];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, with regard to constants.

Comment on lines 45 to 76
/// Add newly received data from the peer node to the buffer and decrypt all possible messages
pub fn decrypt_message_stream(&mut self, new_data: Option<&[u8]>) -> Vec<Vec<u8>> {
let mut read_buffer = if let Some(buffer) = self.read_buffer.take() {
buffer
} else {
Vec::new()
};

if let Some(data) = new_data {
read_buffer.extend_from_slice(data);
}

let mut messages = Vec::new();

loop {
// todo: find way that won't require cloning the entire buffer
let (current_message, offset) = self.decrypt(&read_buffer[..]);
if offset == 0 {
break;
}

read_buffer.drain(0..offset);

if let Some(message) = current_message {
messages.push(message);
} else {
break;
}
}

messages
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look like it used outside of a test. Let's leave it out until it is needed. It would be better to review when the use case is present.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a better idea. It will be in the next revision.

return (None, 0);
}

let encrypted_length = &buffer[0..18]; // todo: abort if too short
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this TODO was addressed above, right?


fn increment_nonce(nonce: &mut u32, chaining_key: &mut [u8; 32], key: &mut [u8; 32]) {
*nonce += 1;
if *nonce == 1000 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a constant for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, you meant the 1,000

Comment on lines 183 to 196
for _ in 0..1002 {
let encrypted_message = encrypted_messages.remove(0);
let mut decrypted_messages = remote_peer.decrypt_message_stream(Some(&encrypted_message));
assert_eq!(decrypted_messages.len(), 1);
let decrypted_message = decrypted_messages.remove(0);
assert_eq!(decrypted_message, hex::decode("68656c6c6f").unwrap());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it is testing a different behavior so it should probably be a different test.

Comment on lines 176 to 181
assert_eq!(encrypted_messages[0], hex::decode("cf2b30ddf0cf3f80e7c35a6e6730b59fe802473180f396d88a8fb0db8cbcf25d2f214cf9ea1d95").unwrap());
assert_eq!(encrypted_messages[1], hex::decode("72887022101f0b6753e0c7de21657d35a4cb2a1f5cde2650528bbc8f837d0f0d7ad833b1a256a1").unwrap());
assert_eq!(encrypted_messages[500], hex::decode("178cb9d7387190fa34db9c2d50027d21793c9bc2d40b1e14dcf30ebeeeb220f48364f7a4c68bf8").unwrap());
assert_eq!(encrypted_messages[501], hex::decode("1b186c57d44eb6de4c057c49940d79bb838a145cb528d6e8fd26dbe50a60ca2c104b56b60e45bd").unwrap());
assert_eq!(encrypted_messages[1000], hex::decode("4a2f3cc3b5e78ddb83dcb426d9863d9d9a723b0337c89dd0b005d89f8d3c05c52b76b29b740f09").unwrap());
assert_eq!(encrypted_messages[1001], hex::decode("2ecd8c8a5629d0d02ab457a0fdd0f7b90a192cd46be5ecb6ca570bfc5e268338b1a16cf4ef2d36").unwrap());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is testing two different behaviors: chaining and key rotation? Ideally, tests should only test one behavior. Could we split this up? I realize this is the test data from BOLT 4, so happy to hear the argument if you think it should be one big test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, it is testing both. Arguably the key rotation is part of the encryption. I can split it up into two tests if you like, but in a separate commit.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eb6a371

Do you test decryption in itself ? +1 for splitting into multiple tests and documenting what exactly is tested.

@@ -0,0 +1,38 @@
/// Wrapper for the first act message
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to get a better sense as to when separate files should be used for some of these smaller structs. In particular, handshake/acts.rs, handshake/hash.rs, and handshake/states.rs are used almost exclusively in handshake/mod.rs. And they constitute less than 100 lines total. What criteria did you used to split these out? Do they deserve separate sub-modules when the structs within them are themselves named using their sub-module name and are not independently tested?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Frankly, just easier navigation through the components in the file system.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given these are small and either are implementation details or form the public interface of handshake/mod.rs, I'd prefer if they were all in the same file. Otherwise, navigation between files is a bit excessive, IMHO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to not stuff the handshake logic with all the accessory structs. I would be open to having one separate file where both the enums/states and the acts would live?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see them as accessories but rather core parts of the handshake module's interface (acts.rs) and implementation (states.rs). The former requires the user to include two things when using the interface. The latter requires the reader to reference another file when reading the implementation details.

I'd concede that an argument can be made for keeping hash.rs as a separate file since it is encapsulating functionality.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what it's worth, so does the Act enum

let remote_pubkey_vec = chacha::decrypt(&act_three_expectation.temporary_key, 1, &hash.value, &tagged_encrypted_pubkey)?;
let mut remote_pubkey_bytes = [0u8; 33];
remote_pubkey_bytes.copy_from_slice(remote_pubkey_vec.as_slice());
// todo: replace unwrap with handleable error type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you address this TODO?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, in a subsequent commit


let mut ephemeral_public_key_bytes = [0u8; 33];
ephemeral_public_key_bytes.copy_from_slice(&act_bytes[1..34]);
// todo: replace unwrap with handleable error type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you address this TODO?


/// Process act dynamically
/// The role must be set before this method can be called
pub fn process_act(&mut self, input: &[u8], remote_public_key: Option<&PublicKey>) -> Result<(Vec<u8>, Option<Conduit>, Option<PublicKey>), String> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some idle thoughts on this interface (process_act) and that of process_act_two and process_act_three:

Rather than returning tuples of Options in process_act and the non-Option values in the other two methods, could they (i.e., PublicKey and Conduit) instead be set as fields in the appropriate act states (without using Option)? Then accessor methods on PeerHandshake could be provided for retrieving them (using Option). That way the code is a bit more self-documented and the return values of the act processing methods would be simpler (i.e., just returning the next Act to send).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the act states are only what's necessary for the next step of the handshake. I think the handshake can be thought of as a chemical reaction, where there is an educt, a product, as well as some external inputs and byproducts. The handshake states are educts and products, and the conduit is a byproduct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the acts and the remote public key in this analogy?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed a bit offline. One idea is to have PeerHandshake initialized with the optional remote public key, so that it is not both an input and an output of process_act. And then have an accessor for accessing it. Similarly with the conduit.

Then the only return value would be the act data which could even be the Act enum instead given it simply wraps the vector, which would make the interface more explicit.

fix last lint doc issue?

make the &Some matches explicit for Rust 1.22

appease Rust 1.22 some more with ampersandery

appease Rust 1.22 by using byte_utils for endianness functionality

appease Rust 1.22 by using byte_utils and ref in match arms

experimenting some more with ref matching for Rust 1.22

might Rust 1.22 finally work? Rearranged a lot of borrowing locations and macro behavior and definition locations

fix bug that was kindly caught by fuzz tests

fix fuzz test

improve error messaging

the different rust versions are a balancing act, and I can't juggle
…stants, and type definitions

respond to Jeff's comments and add node pubkeys to map

simplify review by means of 3 scopes

adjust scope indentation

resolve merge conflicts

respond to more of Jeff's comments

constantify key rotation index for conduit
… handshake method `process_act` across both call sites from peer_handler.rs.
@arik-so arik-so self-assigned this Feb 19, 2020
@arik-so
Copy link
Contributor Author

arik-so commented Feb 20, 2020

Travis seems to have forgotten about this


/// Decrypt a single message. Buffer is an undelimited amount of bytes
fn decrypt(&mut self, buffer: &[u8]) -> (Option<Vec<u8>>, usize) { // the response slice should have the same lifetime as the argument. It's the slice data is read from
if buffer.len() < 18 {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eb6a371

Can you constify or document here than a short read less than encrypted_message_length+MAC is invalid ?

length_bytes.copy_from_slice(length_vec.as_slice());
let message_length = u16::from_be_bytes(length_bytes) as usize;

let message_end_index = message_length + 18; // todo: abort if too short
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eb6a371

Is 16-byte of MAC of the Lightning message counted in message length ? I'm not sure as l=len(m) computed before message encryption so shouldn't be 18 + message_length + 16 here?

Comment on lines 176 to 181
assert_eq!(encrypted_messages[0], hex::decode("cf2b30ddf0cf3f80e7c35a6e6730b59fe802473180f396d88a8fb0db8cbcf25d2f214cf9ea1d95").unwrap());
assert_eq!(encrypted_messages[1], hex::decode("72887022101f0b6753e0c7de21657d35a4cb2a1f5cde2650528bbc8f837d0f0d7ad833b1a256a1").unwrap());
assert_eq!(encrypted_messages[500], hex::decode("178cb9d7387190fa34db9c2d50027d21793c9bc2d40b1e14dcf30ebeeeb220f48364f7a4c68bf8").unwrap());
assert_eq!(encrypted_messages[501], hex::decode("1b186c57d44eb6de4c057c49940d79bb838a145cb528d6e8fd26dbe50a60ca2c104b56b60e45bd").unwrap());
assert_eq!(encrypted_messages[1000], hex::decode("4a2f3cc3b5e78ddb83dcb426d9863d9d9a723b0337c89dd0b005d89f8d3c05c52b76b29b740f09").unwrap());
assert_eq!(encrypted_messages[1001], hex::decode("2ecd8c8a5629d0d02ab457a0fdd0f7b90a192cd46be5ecb6ca570bfc5e268338b1a16cf4ef2d36").unwrap());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eb6a371

Do you test decryption in itself ? +1 for splitting into multiple tests and documenting what exactly is tested.

}

impl PeerHandshake {
pub fn new(private_key: &SecretKey, ephemeral_private_key: Option<&SecretKey>) -> Self {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eb6a371

Shouldn't PeerHandshake get a KeysInterface trait and access node key secret through it ? Maybe not in this PR but you should let a TODO about this, all ecdh operations with key acess should be made behind this.

If we do have to do any operation with private key in-memory, we should document somewhere to zero'out the memory after we don't need them anymore, at PeerHandshake destruction I'm not sure if memory is cleaned


let (act_three, mut conduit) = self.process_act_two(ActTwo(act_two_buffer))?;

if self.read_buffer.len() > 0 { // have we received more data still?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eb6a371

I find this a bit of a blur between handshake phase and transport phase, we already have the offset with act_length, can't we use this to drive conduit reading in a further step ? And avoid clone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hear you, but haven't really figured out a nice alternative yet.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, I figure out after pursing review than I agree it's not that much easy

() => {
match peers.node_id_to_descriptor.entry(peer.their_node_id.unwrap()) {
hash_map::Entry::Occupied(_) => {
log_trace!(self, "Got second connection with {}, closing", log_pubkey!(peer.their_node_id.unwrap()));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

b71b7ea

nit: precise you're closing both connections in the log_trace

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log entry is just getting moved; changing it will also require changing the no breakage unit test, which I strongly wish to avoid in such a massive refactor.

@arik-so arik-so force-pushed the modular_handshake branch from 5155970 to d84dcc8 Compare June 15, 2020 18:44
}

if peer.pending_read_buffer_pos == peer.pending_read_buffer.len() {
Copy link

@julianknutsen julianknutsen Aug 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During the handshake pending_read_buffer is initialized to an act length vector filled with 0s. So, until the full act arrives this branch will not be taken. This PR returns an Err if the process_act() doesn't receive the full act and results in a PeerDataProcessingDecision::Disconnect.

#672


let encrypted_length = &buffer[0..TAGGED_MESSAGE_LENGTH_HEADER_SIZE];
let mut length_bytes = [0u8; MESSAGE_LENGTH_HEADER_SIZE];
length_bytes.copy_from_slice(&chacha::decrypt(&self.receiving_key, self.receiving_nonce as u64, &[0; 0], encrypted_length).unwrap());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found this bug while adding to the fuzz testing today. It should return an error here in the event of a decryption error instead of panicking. This is a bit involved since the Iterator implementation now needs to return Result objects, but it is doable.

julianknutsen added a commit to julianknutsen/rust-lightning that referenced this pull request Aug 26, 2020
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs.
arrays in this layer of the code for fragmentation reasons. To get ahead
of the same issues in this new state machine and to limit the behavior
changes from master, remove Vec from the state machine in favor of thin
wrappers around fixed-size arrays.

This patch reintroduces the Act object (a thin wrapper around fixed size
arrays) with some convenience features to make them a bit easier to
pass around and build.

The Handshake code is still note Vec-clean as the encryption code uses
Vecs during encryption, but it is one step closer to passing back slices
to the peer_handler.
julianknutsen added a commit to julianknutsen/rust-lightning that referenced this pull request Aug 26, 2020
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs.
arrays in this layer of the code for fragmentation reasons. To get ahead
of the same issues in this new state machine and to limit the behavior
changes from master, remove Vec from the state machine in favor of thin
wrappers around fixed-size arrays.

This patch reintroduces the Act object (a thin wrapper around fixed size
arrays) with some convenience features to make them a bit easier to
pass around and build.

The Handshake code is still note Vec-clean as the encryption code uses
Vecs during encryption, but it is one step closer to passing back slices
to the peer_handler.
@ariard
Copy link

ariard commented Aug 27, 2020

@julianknutsen @arik-so How do we move forward with this PR and "Enum Dispatch NOISE State Machine" ?

Should we focus review on this PR first then open a rebased "Enum Dispatch NOISE State Machine" ? Or "Enum Dispatch NOISE State" should supersede it with a new PR ?

@arik-so
Copy link
Contributor Author

arik-so commented Aug 27, 2020

I think the nature of the NOISE PR updating the branch that will then update the modular handshake PR automatically requires that we focus on the NOISE one first.

@julianknutsen
Copy link

I think the nature of the NOISE PR updating the branch that will then update the modular handshake PR automatically requires that we focus on the NOISE one first.

I agree. I think that the end of this is going to look like a "feature branch" represented by https://github.com/arik-so/rust-lightning/tree/modular_handshake and a set of PRs against that branch that address some of the outstanding review comments as well as this handshake state machine to get it into a point where the modular handshake PR is ready to go.

There are still a few outstanding action items from this original PR that I am addressing in future PRs targetting ariks's branch so it seems like vetting code into that branch and then a final PR from his branch into master is the path of least resistance. Still going to be quite a bit of work, but that is just the nature of long-standing dependent PRs.

If there is additional legwork you would like me to do (rebasing, merge PRs, etc) to make it easy just let me know. I know that this wasn't planned work so I want to make it as easy as possible to integrate the changes and additional testing.

@ariard
Copy link

ariard commented Aug 27, 2020

Okay I'm even on this, will review the NOISE PR against Arik's repo so. At least as we track back review context.

julianknutsen added a commit to julianknutsen/rust-lightning that referenced this pull request Sep 9, 2020
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs.
arrays in this layer of the code for fragmentation reasons. To get ahead
of the same issues in this new state machine and to limit the behavior
changes from master, remove Vec from the state machine in favor of thin
wrappers around fixed-size arrays.

This patch reintroduces the Act object (a thin wrapper around fixed size
arrays) with some convenience features to make them a bit easier to
pass around and build.

The Handshake code is still note Vec-clean as the encryption code uses
Vecs during encryption, but it is one step closer to passing back slices
to the peer_handler.
@julianknutsen
Copy link

Superseded by PR stack ending at #694

julianknutsen added a commit to julianknutsen/rust-lightning that referenced this pull request Sep 15, 2020
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs.
arrays in this layer of the code for fragmentation reasons. To get ahead
of the same issues in this new state machine and to limit the behavior
changes from master, remove Vec from the state machine in favor of thin
wrappers around fixed-size arrays.

This patch reintroduces the Act object (a thin wrapper around fixed size
arrays) with some convenience features to make them a bit easier to
pass around and build.

The Handshake code is still note Vec-clean as the encryption code uses
Vecs during encryption, but it is one step closer to passing back slices
to the peer_handler.
julianknutsen added a commit to julianknutsen/rust-lightning that referenced this pull request Sep 15, 2020
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs.
arrays in this layer of the code for fragmentation reasons. To get ahead
of the same issues in this new state machine and to limit the behavior
changes from master, remove Vec from the state machine in favor of thin
wrappers around fixed-size arrays.

This patch reintroduces the Act object (a thin wrapper around fixed size
arrays) with some convenience features to make them a bit easier to
pass around and build.

The Handshake code is still note Vec-clean as the encryption code uses
Vecs during encryption, but it is one step closer to passing back slices
to the peer_handler.
julianknutsen added a commit to julianknutsen/rust-lightning that referenced this pull request Sep 15, 2020
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs.
arrays in this layer of the code for fragmentation reasons. To get ahead
of the same issues in this new state machine and to limit the behavior
changes from master, remove Vec from the state machine in favor of thin
wrappers around fixed-size arrays.

This patch reintroduces the Act object (a thin wrapper around fixed size
arrays) with some convenience features to make them a bit easier to
pass around and build.

The Handshake code is still note Vec-clean as the encryption code uses
Vecs during encryption, but it is one step closer to passing back slices
to the peer_handler.
@ariard
Copy link

ariard commented Sep 17, 2020

Closing, as it's superseded.

@ariard ariard closed this Sep 17, 2020
julianknutsen added a commit to julianknutsen/rust-lightning that referenced this pull request Sep 18, 2020
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs.
arrays in this layer of the code for fragmentation reasons. To get ahead
of the same issues in this new state machine and to limit the behavior
changes from master, remove Vec from the state machine in favor of thin
wrappers around fixed-size arrays.

This patch reintroduces the Act object (a thin wrapper around fixed size
arrays) with some convenience features to make them a bit easier to
pass around and build.

The Handshake code is still note Vec-clean as the encryption code uses
Vecs during encryption, but it is one step closer to passing back slices
to the peer_handler.
julianknutsen added a commit to julianknutsen/rust-lightning that referenced this pull request Oct 2, 2020
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs.
arrays in this layer of the code for fragmentation reasons. To get ahead
of the same issues in this new state machine and to limit the behavior
changes from master, remove Vec from the state machine in favor of thin
wrappers around fixed-size arrays.

This patch reintroduces the Act object (a thin wrapper around fixed size
arrays) with some convenience features to make them a bit easier to
pass around and build.

The Handshake code is still note Vec-clean as the encryption code uses
Vecs during encryption, but it is one step closer to passing back slices
to the peer_handler.
julianknutsen added a commit to julianknutsen/rust-lightning that referenced this pull request Oct 2, 2020
After reviewing lightningdevkit#494 there was design feedback on the use of Vec vs.
arrays in this layer of the code for fragmentation reasons. To get ahead
of the same issues in this new state machine and to limit the behavior
changes from master, remove Vec from the state machine in favor of thin
wrappers around fixed-size arrays.

This patch reintroduces the Act object (a thin wrapper around fixed size
arrays) with some convenience features to make them a bit easier to
pass around and build.

The Handshake code is still note Vec-clean as the encryption code uses
Vecs during encryption, but it is one step closer to passing back slices
to the peer_handler.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants